Abstract

In this article we discuss estimation of the common variance of sev- 
eral normal populations with tree order restricted means. We discuss the 
asymptotic properties of the maximum likelihood estimator of the vari- 
ance as the number of populations tends to infinity. We consider several 
cases of various orders of the sample sizes and show that the maximum 
likelihood estimator of the variance may or may not be consistent or be 
asymptotically normal. 

1 Introduction 

1.1 Background and Motivation 

Tree order restrictions arise naturally in many important applications.
One classical scenario is the comparison of several, say $s$, treatments with
a known control or a placebo treatment. It is then natural to model the
effect of the $i$-th treatment, say $\mu_i$, to be at least as large as the effect of the
control treatment, denoted by $\mu_0$, that is, $\mu_0 \le \mu_i$ for all $i = 1, 2, \ldots, s$.

Under the tree order restriction the parameter space for $\mu := (\mu_0, \mu_1, \mu_2, \ldots, \mu_s)$
forms a symmetric polyhedral cone in $\mathbb{R}^{s+1}$ with its spine along
the line $\mu_0 = \mu_1 = \cdots = \mu_s$. It can be shown that under the normality
assumption the constrained maximum likelihood estimator (MLE) of the
mean vector $\mu$ is biased in many situations. In fact, in [7] it was shown that
if the $\mu_i$'s remain bounded and the sample sizes $n_i^{(s)}$ remain bounded, then
the bias of $\hat\mu_0^{(s)}$ diverges to $-\infty$ as $s \to \infty$. Because of this phenomenon
the constrained MLEs have been criticised severely in the literature. Hwang
and Peddada [6] wrote that the MLE "fails disastrously" and Cohen and
Sackrowitz remarked that the MLE is "undesirable".

Under certain conditions, however, $\hat\mu_0^{(s)}$ is unbiased. It was shown by
Chaudhuri and Perlman that if either $\mu_{(1)} := \min_{i \ge 1} \mu_i$ or $n_0^{(s)}$
grows sufficiently fast, then the MLE of $\mu_0$ can be bounded from below in
probability and it may even be consistent.

In all earlier works it is generally assumed that the treatment groups
are homoscedastic, but none considered estimation of the variance $\sigma^2$. In
this article we discuss maximum likelihood estimation of $\sigma^2$ under the normality
assumption with the tree order restriction on the population means.
We consider the asymptotic properties of this estimator as $s \to \infty$, that
is, in the limit the dimension of $\mu$ becomes large. We show that, depending
on the growth of $n^{(s)}$ with $s$, the MLE $\hat\sigma^2_{(s)}$ is consistent under mild
conditions, even though in some of these cases $\hat\mu^{(s)}$ is not consistent. Under
stricter assumptions we also prove asymptotic normality for the
estimator $\hat\sigma^2_{(s)}$.

1.2 Constrained Maximum Likelihood Estimator
of $\mu$ and $\sigma^2$

Suppose there are $s + 1$ independent populations indexed by $0, 1, 2, \ldots, s$
with unknown means $(\mu_i)_{i \ge 0}$ and unknown common variance $\sigma^2$. Let
$X_{i1}, X_{i2}, \ldots, X_{i n_i^{(s)}}$ be an i.i.d. sample from the $i$-th population.

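Under the tree order restriction and the assumptions made above, the
maximum likelihood estimator of $\mu$ is given by (see [9], [10])

\[ \hat\mu_0^{(s)} = \min_{I \subseteq \{1, \ldots, s\}} \frac{n_0^{(s)} \bar X_0^{(s)} + \sum_{i \in I} n_i^{(s)} \bar X_i^{(s)}}{n_0^{(s)} + \sum_{i \in I} n_i^{(s)}} \quad \text{and} \tag{1} \]

\[ \hat\mu_i^{(s)} = \max\left( \hat\mu_0^{(s)}, \bar X_i^{(s)} \right), \quad 1 \le i \le s, \tag{2} \]

where $\bar X_i^{(s)}$ denotes the sample mean of the $i$-th population.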



Note that equations (1) and (2) also give the least squares estimators of
the mean vector for any distribution.
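As an illustration of the estimator just described, the sketch below computes the tree-order restricted estimate in two ways: by brute force over all subsets $I$ of treatments (the minimum of pooled weighted averages), and by a pooling scan that adds treatments in increasing order of their sample means while they lie below the current pooled average. The function names and the small data set are hypothetical; the equivalence of the two routines rests on the standard fact that the minimising subset consists of the smallest sample means.

```python
from itertools import combinations

def mu0_bruteforce(means, sizes):
    """Minimum pooled average over all subsets I of treatments (index 0 is the control)."""
    best = means[0]  # I = empty set
    treatments = range(1, len(means))
    for r in range(1, len(means)):
        for subset in combinations(treatments, r):
            num = sizes[0] * means[0] + sum(sizes[i] * means[i] for i in subset)
            den = sizes[0] + sum(sizes[i] for i in subset)
            best = min(best, num / den)
    return best

def tree_order_estimate(means, sizes):
    """Tree-order restricted estimate: [mu0_hat, mu1_hat, ..., mus_hat]."""
    order = sorted(range(1, len(means)), key=lambda i: means[i])
    total, weight = sizes[0] * means[0], sizes[0]
    mu0 = total / weight
    for i in order:
        if means[i] >= mu0:      # pooling this mean would no longer lower the average
            break
        total += sizes[i] * means[i]
        weight += sizes[i]
        mu0 = total / weight
    # Each treatment estimate is the larger of mu0_hat and its own sample mean.
    return [mu0] + [max(mu0, x) for x in means[1:]]

means = [1.0, 0.2, 1.5, 0.6, -0.4]   # hypothetical sample means, control first
sizes = [3, 2, 2, 4, 1]              # hypothetical sample sizes
estimate = tree_order_estimate(means, sizes)
```

The brute-force routine is exponential in $s$ and is only meant as a cross-check on small inputs; the pooling scan costs one sort.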

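Under the tree order restriction the constrained maximum likelihood
estimator of $\sigma^2$ is given by

\[ \hat\sigma^2_{(s)} = \frac{1}{N^{(s)}} \left[ \sum_{j=1}^{n_0^{(s)}} \left( X_{0j} - \hat\mu_0^{(s)} \right)^2 + \sum_{i=1}^{s} \sum_{j=1}^{n_i^{(s)}} \left( X_{ij} - \hat\mu_i^{(s)} \right)^2 \right], \tag{3} \]

where $N^{(s)} = \sum_{i=0}^{s} n_i^{(s)}$ is the total sample size.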



1.3 Main Results 

We make the following assumptions throughout this article: 

A1: The mean vector $\mu$ satisfies the tree order restriction, that is, $\mu_0 \le \mu_i$ for all
$i \ge 1$.

A2: There exists $B > 0$ such that $\mu_i \le B$ for all $i \ge 0$.
A3: The populations are normal.
For simplicity we further assume that

A4: $n_1^{(s)} = n_2^{(s)} = \cdots = n_s^{(s)} = n^{(s)}$, and both $n_0^{(s)}$ and $n^{(s)}$ are
non-decreasing in $s$.

Our main interest is to study the asymptotic properties of the maximum
likelihood estimator of $\sigma^2$, namely $\hat\sigma^2_{(s)}$, as the number of populations
becomes large.

Let $N^{(s)} = \sum_{i=0}^{s} n_i^{(s)}$ be the total sample size, and write
$\bar X_i^{(s)} := \frac{1}{n_i^{(s)}} \sum_{j=1}^{n_i^{(s)}} X_{ij}$ for the sample mean of the $i$-th population, where $0 \le i \le s$.

We first consider an example with two populations, with tree-ordered
means. The size of the sample drawn from the population with the larger
mean increases linearly with $s$, while the size of the placebo sample remains
constant.

Theorem 1. Consider two populations with a common variance $\sigma^2$ and
means $\mu_0 \le \mu_1$. Let $n_0^{(s)} = m$ and $n_1^{(s)} = m's$, where $m, m' \ge 1$ are two
fixed integers. Then as $s \to \infty$

\[ \hat\sigma^2_{(s)} \to \sigma^2 \quad \text{a.s.} \tag{4} \]

Moreover,

\[ \sqrt{N^{(s)}} \left( \hat\sigma^2_{(s)} - \sigma^2 \right) \Rightarrow N(0, 2\sigma^4). \tag{5} \]

Note that, as pointed out in Section 3, it is not difficult to show that
$\hat\mu_0^{(s)}$ is biased in this case, yet $\hat\sigma^2_{(s)}$ is consistent and a CLT holds.

The assumptions of Theorem 1 can be interpreted in the following
alternative way. Suppose we consider $s + 1$ populations with an unknown
common variance $\sigma^2$ and $\mu_1 = \mu_2 = \cdots = \mu_s$ with $\mu_0 \le \mu_1$. Both $\mu_0$
and $\mu_1$ are unknown. Let $n_0^{(s)} = m$ and $n^{(s)} = m'$ be the sample sizes
from these distributions. Clearly the MLEs of $\mu_0$, $\mu_1$ and $\sigma^2$ are exactly
the same as in Theorem 1. So it follows that in the limit as $s \to \infty$, the MLE
$\hat\sigma^2_{(s)}$ is consistent and admits a CLT. It is worth mentioning here that the
assumption $\mu_1 = \mu_2 = \cdots = \mu_s$ is crucial for the consistency
and also for the CLT. This is because, as stated in Theorem 4 below, the
consistency may fail, and the simulations presented in Section 2 show that the
CLT may not hold either.

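Our next theorem deals with the case when the total sample size
grows at a faster rate than $s$.

Theorem 2. Suppose $N^{(s)} \to \infty$. Then under the assumptions A1 to A4

\[ P\left( 0 \le \limsup_{s \to \infty} \hat\sigma^2_{(s)} \le \sigma^2 \right) = 1. \tag{6} \]

Further, if we assume that $s/N^{(s)} \to 0$ as $s \to \infty$, then

\[ \hat\sigma^2_{(s)} \to \sigma^2 \quad \text{a.s.,} \]

while if $s/\sqrt{N^{(s)}} \to 0$, then

\[ \sqrt{N^{(s)}} \left( \hat\sigma^2_{(s)} - \sigma^2 \right) \Rightarrow N(0, 2\sigma^4). \]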
From the proof of Theorem 2 in Section 3 we see that the result holds
for any constraint on the mean vector $\mu$, which need not be the tree
order restriction. So this result may be used in other constrained problems,
for example, in the study of the isotonic regression model, where
one assumes that $\mu_0 \le \mu_1 \le \mu_2 \le \cdots \le \mu_s$.

The following result is immediate from Theorem 2; it covers the case
when $n^{(s)} = m$ is fixed but $n_0^{(s)}$ increases at an appropriate rate.

Theorem 3. Let $n_0^{(s)}$ be such that $s/n_0^{(s)} \to 0$ as $s \to \infty$. Then under
the assumptions A1 to A4, $\hat\sigma^2_{(s)}$ is strongly consistent. Moreover, if
$s/\sqrt{n_0^{(s)}} \to 0$ as $s \to \infty$, then

\[ \sqrt{N^{(s)}} \left( \hat\sigma^2_{(s)} - \sigma^2 \right) \Rightarrow N(0, 2\sigma^4). \]

It is worth noting that under the conditions in Theorem 3 the MLE
$\hat\mu_0^{(s)}$ is consistent.

From Theorem 2, the strong consistency holds if either $n_0^{(s)}/s \to \infty$
or $n^{(s)} \to \infty$ as $s \to \infty$. For the CLT to hold we need a stronger condition,
namely $n_0^{(s)}/s^2 \to \infty$ or $n^{(s)}/s \to \infty$ as $s \to \infty$. In particular, the consistency result covers
the case when $n^{(s)}/\log s \to \infty$, for which $\hat\mu_0^{(s)}$ is consistent if and only
if $n_0^{(s)} \to \infty$ (see Proposition 1 in Section 4). In this case we have not
been able to prove the CLT, which may hold (see Section 2 for simulated
results).

The following theorem deals with the case when both $n_0^{(s)}$ and $n^{(s)}$ remain
bounded.

Theorem 4. Suppose $n_0^{(s)} = n^{(s)} = m$ for some fixed $m \ge 2$. Then under
the assumptions A1 to A4

\[ \hat\sigma^2_{(s)} \to \frac{m-1}{m} \sigma^2 \quad \text{a.s.} \tag{7} \]

As in Theorem 1, in this case also $N^{(s)} \sim ms$, but $\hat\sigma^2_{(s)}$ is not consistent.
The difference is in the dimension of the mean parameter being estimated.
In Theorem 1 it is exactly 2, whereas in Theorem 4 it grows unboundedly
with $s$. Once again, in this case the CLT may not hold, and we present
some simulation results in Section 2.

The apparent ambiguity between Theorems 1 and 4 is reminiscent of
the so-called Neyman-Scott example. They considered i.i.d. samples
of equal finite size from several normal populations with unknown means
and common variance. It was shown that, in the limit as the number of populations
increases, the MLE of the common variance is inconsistent. Here
we observe the same phenomenon with the tree order restriction on $\mu$. For
estimation of $\sigma^2$, $\mu$ is a nuisance parameter. In Theorem 1, $\mu_i = \mu_1$ for
all $i \ge 1$. Thus even in the limit $s \to \infty$, the number of nuisance
parameters remains bounded. In contrast, in Theorem 4 this number increases
unboundedly. This explains the inconsistency of $\hat\sigma^2_{(s)}$ in the latter.
Chaudhuri and Perlman argue that in general, if the number of nuisance
parameters is allowed to grow, the MLE of the parameter of interest
may not be consistent. It may not even converge to any limit.
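The classical (unconstrained) Neyman-Scott effect mentioned above can be reproduced in a few lines. The sketch below is only an illustration under assumed settings: the function name, the uniformly drawn nuisance means, and the seed are hypothetical choices, and no order restriction is imposed. With $m$ observations per population the variance MLE settles near $(m-1)\sigma^2/m$ rather than $\sigma^2$.

```python
import numpy as np

def neyman_scott_variance_mle(s, m, sigma2=1.0, seed=0):
    """Unconstrained MLE of the common variance with s populations of size m each."""
    rng = np.random.default_rng(seed)
    mu = rng.uniform(-1.0, 1.0, size=s)           # arbitrary nuisance means (assumption)
    x = mu[:, None] + np.sqrt(sigma2) * rng.standard_normal((s, m))
    resid = x - x.mean(axis=1, keepdims=True)     # centre each population at its own mean
    return float((resid ** 2).sum() / (s * m))    # divide by the total sample size

est = neyman_scott_variance_mle(s=20000, m=2)     # expected to be close to 1/2, not 1
```

Increasing `m` with `s` fixed moves the estimate towards `sigma2`, which mirrors the role of the per-population sample size in the theorems above.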
Figure 1: Box plot (Figure 1(a)) and histogram for $s = 10000$ (Figure 1(b)) of
$\zeta_s = \sqrt{N^{(s)}} (I_2 + I_4)$ when $n_0^{(s)} = n^{(s)} = m$.



One important case which is not covered by the results described above
is when $n_0^{(s)} = O(s)$ and $n^{(s)}$ remains bounded. Chaudhuri and Perlman
show that in this case $\hat\mu_0^{(s)}$ remains bounded from below with high
probability, but may not be consistent. In Section 2 we present some
simulation results for this case.

1.4 Outline 

The rest of the article is structured as follows. The next section gives
the detailed simulation results for the cases mentioned above. In Section
3 we present the proofs of the main results. Section 4 contains some
technical results and their proofs, which are used to prove the main
results.



2 Simulation Studies for Some Unresolved 
Cases 

In this section we perform a simulation study to explore the asymptotic
behaviour of $\hat\sigma^2_{(s)}$ under some conditions which are not covered by the
results in Section 1.3. In particular, we consider the following three situations:

1. We check if the CLT holds when $n_0^{(s)} = n^{(s)} = m$. From Theorem 4 we
know that $\hat\sigma^2_{(s)}$ is inconsistent for $\sigma^2$ in this case.

2. We check if the CLT holds when $n_0^{(s)} = n^{(s)} = (\log s)^2$. Notice that in
this case $N^{(s)} = (s + 1)(\log s)^2$, and from Theorem 2 it follows that
$\hat\sigma^2_{(s)}$ is consistent for $\sigma^2$. However, $s/\sqrt{N^{(s)}} \not\to 0$, so the
CLT cannot be derived from Theorem 2.

3. The asymptotic behaviour of $\hat\sigma^2_{(s)}$ when $n_0^{(s)} = O(s)$ and $n^{(s)}$ remains
bounded is not covered by any result considered above. In this case
even the consistency of $\hat\mu_0^{(s)}$ is not known, though it is bounded below
with high probability [3].

Figure 2: Box plot (Figure 2(a)) and histogram for $s = 10000$ (Figure 2(b)) of
$\zeta_s = \sqrt{N^{(s)}} (I_2 + I_4)$ when $n_0^{(s)} = n^{(s)} = (\log s)^2$.
Figure 3: Box plot of $\zeta_s/\sqrt{N^{(s)}} = I_2 + I_4$ when $n_0^{(s)} = O(s)$ and $n^{(s)} = m$.



In the simulation study, for simplicity we assume all $\mu_i$, $i \ge 0$, to
be equal, which is equivalent to assuming $\mu = (0, 0, \ldots, 0)$. We assume
$\sigma^2 = 1$. In order to study the asymptotic behaviour for large values of $s$,
we consider $s = 10, 50, 100, 500, 1000, 5000, 10000$. The presented results
are based on 2500 repetitions for each population size $s$.
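A minimal sketch of how such a simulation could be organised is given below. It is not the authors' code: the function name, the seed and the default number of repetitions are assumptions. It draws the sample means directly (with $\mu = 0$ and $\sigma^2 = 1$, $\bar X_i \sim N(0, 1/n_i)$), computes the tree-order estimates by pooling the control with the smallest treatment means, and returns replicates of $\sqrt{N^{(s)}}(I_2 + I_4) = \frac{1}{\sqrt{N^{(s)}}} \sum_{i=0}^{s} n_i^{(s)} (\bar X_i^{(s)} - \hat\mu_i^{(s)})^2$.

```python
import numpy as np

def simulate_sqrtN_I2_plus_I4(s, n0, n, reps=200, seed=0):
    """Replicates of sqrt(N) * (I2 + I4) for mu = 0, sigma^2 = 1, equal treatment sizes n."""
    rng = np.random.default_rng(seed)
    N = n0 + s * n
    out = np.empty(reps)
    for r in range(reps):
        xbar0 = rng.standard_normal() / np.sqrt(n0)           # control sample mean
        xbar = np.sort(rng.standard_normal(s) / np.sqrt(n))   # sorted treatment sample means
        # Pooled averages of the control with the k smallest treatment means, k = 0..s.
        partial = np.concatenate(([0.0], np.cumsum(xbar)))
        pooled = (n0 * xbar0 + n * partial) / (n0 + n * np.arange(s + 1))
        mu0 = pooled.min()                                    # restricted estimate of mu_0
        mu = np.maximum(mu0, xbar)                            # restricted estimates of mu_i
        out[r] = (n0 * (xbar0 - mu0) ** 2 + n * ((xbar - mu) ** 2).sum()) / np.sqrt(N)
    return out

draws = simulate_sqrtN_I2_plus_I4(s=100, n0=5, n=5, reps=50)
```

The three regimes of this section correspond, for example, to `n0 = n = m`, to `n0 = n = round(log(s)**2)`, and to `n0` proportional to `s` with `n` fixed; the exact constants used in the original study are not stated here and would need to be chosen.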
Figure 4: Box plot (Figure 4(a)) and histogram for $s = 10000$ (Figure 4(b)) of
$\zeta_s = \sqrt{N^{(s)}} (I_2 + I_4)$ when $n_0^{(s)} = O(s)$ and $n^{(s)} = m$.



Under all the above conditions the term $I_1 + I_3$ in (8) has a
finite limit (in probability) and $\sqrt{N^{(s)}} (I_1 + I_3)$, suitably centred, converges in distribution.
We therefore concentrate on the random variable

\[ \zeta_s := \sqrt{N^{(s)}} (I_2 + I_4) = \frac{1}{\sqrt{N^{(s)}}} \sum_{i=0}^{s} n_i^{(s)} \left( \bar X_i^{(s)} - \hat\mu_i^{(s)} \right)^2 . \]

In the figures we present box and whisker plots of $\zeta_s$ for each case. We
also present the histogram of $\zeta_s$ for $s = 10000$ in all cases. The bottom and
the top of the boxes in the box and whisker plots represent the first and
third quartiles respectively. The line inside the box is the median. The
whiskers were drawn to represent the most extreme data points still within
1.5 times the interquartile range of the first and the third quartiles.
For better representation we have omitted more extreme points.
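The box plot convention described above can be written out explicitly. The sketch below is an assumption about the details: quartile definitions differ between software packages, and here NumPy's default linear-interpolation percentiles are used, which may not match the plotting software behind the original figures. The function name is hypothetical.

```python
import numpy as np

def box_plot_summary(x):
    """Quartiles and whisker ends under the 1.5 * IQR rule."""
    x = np.asarray(x, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    # Whiskers reach the most extreme observations inside the 1.5 * IQR fences.
    whisker_low = x[x >= q1 - 1.5 * iqr].min()
    whisker_high = x[x <= q3 + 1.5 * iqr].max()
    return {"q1": q1, "median": median, "q3": q3,
            "whisker_low": whisker_low, "whisker_high": whisker_high}

summary = box_plot_summary([1, 2, 3, 4, 5, 100])   # 100 lies outside the upper fence
```

Observations beyond the whiskers, such as the value 100 in this toy input, are the "more extreme points" that are omitted from the plots.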

When $n_0^{(s)} = n^{(s)} = m$, from Figure 1(a) it is clear that the spread
of the distribution of $\zeta_s = \sqrt{N^{(s)}} (I_2 + I_4)$ reduces with $s$. Further, it is seen that the medians
of the distributions have a decreasing trend with $s$. However, we cannot
conclude that $\zeta_s$ converges to 0 in probability, because the histogram in
Figure 1(b) does not show any concentration near 0.

The case when $n_0^{(s)} = n^{(s)} = (\log s)^2$ is presented in Figures 2(a) and
2(b). From Figure 2(a) it seems that $\zeta_s$ converges to 0 in probability. The
histogram (Figure 2(b)) also seems to be quite concentrated near 0. So a
CLT may hold in this case.

The situation when $n_0^{(s)} = O(s)$ and $n^{(s)} = 100$ shows a different
picture. As seen from the box plot in Figure 3, $\zeta_s/\sqrt{N^{(s)}} = I_2 + I_4$ seems
to be rapidly converging to 0 in probability, which indicates that $\hat\sigma^2_{(s)}$ may be
consistent. However, the box plot in Figure 4(a) and the histogram in Figure
4(b) indicate that asymptotically $\zeta_s$ may not be converging to 0; thus the
CLT may not hold. It is interesting to note that the histogram of $\zeta_s$ in
Figure 4(b) is almost symmetric around its mean, for which we do not
have an intuitive explanation.
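3 Proofs of the Main Results

We start by observing that by Lemma 1 in Section 4

\[ \hat\sigma^2_{(s)} = I_1 + I_2 + I_3 + I_4, \tag{8} \]

where

\[
I_1 = \frac{1}{N^{(s)}} \sum_{j=1}^{n_0^{(s)}} \left( X_{0j} - \bar X_0^{(s)} \right)^2, \quad
I_2 = \frac{n_0^{(s)}}{N^{(s)}} \left( \bar X_0^{(s)} - \hat\mu_0^{(s)} \right)^2, \quad
I_3 = \frac{1}{N^{(s)}} \sum_{i=1}^{s} \sum_{j=1}^{n_i^{(s)}} \left( X_{ij} - \bar X_i^{(s)} \right)^2, \quad
I_4 = \frac{1}{N^{(s)}} \sum_{i=1}^{s} n_i^{(s)} \left( \bar X_i^{(s)} - \hat\mu_i^{(s)} \right)^2.
\tag{9}
\]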

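3.1 Proof of Theorem 1

We have two populations and $N^{(s)} = m + m's$. Since $n_0^{(s)} = m$ and
$n_1^{(s)} = m's$, $m/N^{(s)} \to 0$ and $m's/N^{(s)} \to 1$ as $s \to \infty$. Further, $\bar X_0^{(s)}$ does
not depend on $s$ and $X_{1j}$ are i.i.d. random variables for $j = 1, 2, \ldots, m's$,
with $E(X_{11}) = \mu_1$. So by the SLLN

\[ \frac{m \bar X_0^{(s)} + m's \, \bar X_1^{(s)}}{N^{(s)}} \to \mu_1 \quad \text{a.s.} \]

Now using the fact that $\bar X_0^{(s)}$ does not depend on $s$, it follows that

\[ \hat\mu_0^{(s)} \to \min\left( \bar X_0^{(s)}, \mu_1 \right) \quad \text{a.s.,} \tag{10} \]

\[ \hat\mu_1^{(s)} \to \max\left( \min\left( \bar X_0^{(s)}, \mu_1 \right), \mu_1 \right) = \mu_1 \quad \text{a.s.} \tag{11} \]

Notice that in this case $I_1 = \frac{1}{N^{(s)}} \sum_{j=1}^{m} (X_{0j} - \bar X_0^{(s)})^2$ and $\bar X_0^{(s)}$ does
not depend on $s$, so

\[ \lim_{s \to \infty} I_1 = 0. \tag{12} \]

Also $I_2 = \frac{m}{N^{(s)}} (\bar X_0^{(s)} - \hat\mu_0^{(s)})^2$, so by equation (10) we get

\[ \lim_{s \to \infty} I_2 = 0. \tag{13} \]

Further $I_4 = \frac{m's}{N^{(s)}} (\bar X_1^{(s)} - \hat\mu_1^{(s)})^2$, so using equation (11) we get

\[ \lim_{s \to \infty} I_4 = 0. \tag{14} \]

Now observe that by the standard SLLN

\[ I_3 = \frac{1}{N^{(s)}} \sum_{j=1}^{m's} \left( X_{1j} - \bar X_1^{(s)} \right)^2 \to \sigma^2 \quad \text{a.s.} \tag{15} \]

So collecting the terms in (8) we get $\hat\sigma^2_{(s)} \to \sigma^2$ a.s., proving the strong
consistency.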
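To show the asymptotic normality we consider $\sqrt{N^{(s)}} (\hat\sigma^2_{(s)} - \sigma^2)$. By
arguments similar to those for equations (12) and (13) we can prove that

\[ \lim_{s \to \infty} \sqrt{N^{(s)}} I_1 = 0 = \lim_{s \to \infty} \sqrt{N^{(s)}} I_2 . \tag{16} \]

Further note that

\[
\bar X_1^{(s)} - \hat\mu_1^{(s)} = \min\left( 0, \bar X_1^{(s)} - \hat\mu_0^{(s)} \right)
= \min\left( 0, \bar X_1^{(s)} - \min\left( \bar X_0^{(s)}, \frac{m \bar X_0^{(s)} + m's \, \bar X_1^{(s)}}{N^{(s)}} \right) \right)
= \frac{m}{N^{(s)}} \left( \bar X_1^{(s)} - \bar X_0^{(s)} \right) \mathbf{1}\{ \bar X_1^{(s)} < \bar X_0^{(s)} \}. \tag{17}
\]

So it follows that

\[ \sqrt{N^{(s)}} I_4 \to 0 \quad \text{a.s.} \tag{18} \]

Now using the standard CLT we get

\[ \sqrt{N^{(s)}} \left( I_3 - \sigma^2 \right) = \sqrt{N^{(s)}} \left( \frac{1}{N^{(s)}} \sum_{j=1}^{m's} \left( X_{1j} - \bar X_1^{(s)} \right)^2 - \sigma^2 \right) \Rightarrow N(0, 2\sigma^4). \tag{19} \]

Finally, using equations (16), (18) and (19) we conclude that

\[ \sqrt{N^{(s)}} \left( \hat\sigma^2_{(s)} - \sigma^2 \right) \Rightarrow N(0, 2\sigma^4). \]

□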

3.2 Proof of Theorem 2

We start by noting that since $\hat\mu^{(s)}$ is the least squares estimator of $\mu$,

\[ \frac{1}{N^{(s)}} \sum_{i=0}^{s} \sum_{j=1}^{n_i^{(s)}} \left( X_{ij} - \hat\mu_i^{(s)} \right)^2 \le \frac{1}{N^{(s)}} \sum_{i=0}^{s} \sum_{j=1}^{n_i^{(s)}} \left( X_{ij} - \mu_i \right)^2, \tag{20} \]

provided that $\mu$ satisfies the required tree order restriction. Here we note
that a specific constraint such as the tree order restriction is not needed to
claim equation (20); it holds for any general constraint under which the
least squares estimator is obtained, as long as $\mu$ satisfies it.
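Now it follows that

\[ 0 \le \hat\sigma^2_{(s)} \le \tilde\sigma^2_{(s)} := \frac{1}{N^{(s)}} \sum_{i=0}^{s} \sum_{j=1}^{n_i^{(s)}} \left( X_{ij} - \mu_i \right)^2 . \tag{21} \]

Since $X_{ij} - \mu_i$ are i.i.d. $N(0, \sigma^2)$, by the SLLN the first assertion of the
theorem follows. Furthermore, from the fundamental decomposition, (8)
and (9), we get

\[ 0 \le I_2 + I_4 \le \frac{1}{N^{(s)}} \sum_{i=0}^{s} n_i^{(s)} \left( \bar X_i^{(s)} - \mu_i \right)^2 . \tag{22} \]

Note that $\sqrt{n_i^{(s)}} (\bar X_i^{(s)} - \mu_i) \sim N(0, \sigma^2)$, which implies $n_i^{(s)} (\bar X_i^{(s)} - \mu_i)^2 \sim \sigma^2 \chi^2_1$.
Thus from the SLLN it follows that $(s+1)^{-1} \sum_{i=0}^{s} n_i^{(s)} (\bar X_i^{(s)} - \mu_i)^2 \to \sigma^2$ a.s.

By assumption $(s + 1)/N^{(s)} \to 0$, thus

\[ 0 \le I_2 + I_4 \le \frac{s+1}{N^{(s)}} \cdot \frac{1}{s+1} \sum_{i=0}^{s} n_i^{(s)} \left( \bar X_i^{(s)} - \mu_i \right)^2 \to 0 \quad \text{a.s.} \]

Moreover,

\[ 0 \le \tilde\sigma^2_{(s)} - \hat\sigma^2_{(s)} = \frac{1}{N^{(s)}} \sum_{i=0}^{s} n_i^{(s)} \left( \bar X_i^{(s)} - \mu_i \right)^2 - [I_2 + I_4] \to 0 \quad \text{a.s.} \]

Now using the SLLN as before, $\tilde\sigma^2_{(s)} \to \sigma^2$ almost surely. So $\hat\sigma^2_{(s)}$ is strongly
consistent.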

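Now we assume that $s/\sqrt{N^{(s)}} \to 0$. To prove the CLT we observe that

\[ \sqrt{N^{(s)}} \left( \hat\sigma^2_{(s)} - \sigma^2 \right) = \sqrt{N^{(s)}} \left( I_1 + I_3 - \sigma^2 \right) + \sqrt{N^{(s)}} \left( I_2 + I_4 \right). \]

Now from (22)

\[ 0 \le \sqrt{N^{(s)}} (I_2 + I_4) \le \frac{s+1}{\sqrt{N^{(s)}}} \cdot \frac{1}{s+1} \sum_{i=0}^{s} n_i^{(s)} \left( \bar X_i^{(s)} - \mu_i \right)^2 . \]

Thus, using a similar argument as above, we get that $s/\sqrt{N^{(s)}} \to 0$ implies $\sqrt{N^{(s)}} (I_2 + I_4) \to 0$
a.s.

Let us now denote $Y_s = \sum_{i=0}^{s} \sum_{j=1}^{n_i^{(s)}} (X_{ij} - \bar X_i^{(s)})^2$. Note that $\sum_{j=1}^{n_i^{(s)}} (X_{ij} - \bar X_i^{(s)})^2$, $0 \le i \le s$,
are independent $\sigma^2 \chi^2_{n_i^{(s)} - 1}$ distributed random variables. Thus $Y_s \sim \sigma^2 \chi^2_{N^{(s)} - s - 1}$.
Now

\[
\sqrt{N^{(s)}} \left( I_1 + I_3 - \sigma^2 \right)
= \frac{Y_s - E[Y_s] - (s+1)\sigma^2}{\sqrt{N^{(s)}}}
= \sqrt{\frac{\mathrm{Var}[Y_s]}{N^{(s)}}} \cdot \frac{Y_s - E[Y_s]}{\sqrt{\mathrm{Var}[Y_s]}} - \frac{s+1}{\sqrt{N^{(s)}}} \sigma^2 .
\]

Note that from properties of the chi-square distribution

\[ \frac{Y_s - E[Y_s]}{\sqrt{\mathrm{Var}[Y_s]}} \Rightarrow N(0, 1). \]

Also under our assumption $\mathrm{Var}[Y_s]/N^{(s)} = 2(N^{(s)} - s - 1)\sigma^4/N^{(s)} \to 2\sigma^4$.
This completes the proof. □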
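The chi-square step in this proof is easy to check numerically. The sketch below is a sanity check under assumed settings (the function name, group sizes, number of repetitions and seed are all hypothetical): it simulates the pooled within-group sum of squares $Y_s$ and compares its empirical mean and variance with $(N^{(s)} - s - 1)\sigma^2$ and $2(N^{(s)} - s - 1)\sigma^4$.

```python
import numpy as np

def within_group_sum_of_squares(sizes, reps=20000, sigma2=1.0, seed=0):
    """Simulate Y_s = sum_i sum_j (X_ij - Xbar_i)^2 for normal groups of the given sizes."""
    rng = np.random.default_rng(seed)
    y = np.zeros(reps)
    for n_i in sizes:
        x = np.sqrt(sigma2) * rng.standard_normal((reps, n_i))
        # Group means do not matter here: centring removes them.
        y += ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    return y

sizes = [6, 6, 6, 6, 6]              # s + 1 = 5 groups, N = 30
y = within_group_sum_of_squares(sizes)
df = sum(sizes) - len(sizes)         # N - s - 1 = 25 degrees of freedom
```

With $\sigma^2 = 1$ the empirical mean of `y` should be close to `df` and its variance close to `2 * df`.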

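3.3 Proof of Theorem 4

By assumption $n_0^{(s)} = n^{(s)} = m$ and $N^{(s)} = m(1 + s)$; thus from (9) it
follows that

\[ I_1 + I_3 = \frac{m-1}{m(s+1)} \sum_{i=0}^{s} W_i, \]

where $W_i = \frac{1}{m-1} \sum_{j=1}^{m} (X_{ij} - \bar X_i^{(s)})^2$ for $i \ge 0$. Then $\{W_i\}_{i \ge 0}$ are i.i.d.
random variables with mean $\sigma^2$ and variance $2\sigma^4/(m-1)$. So by the standard SLLN
we get

\[ I_1 + I_3 \to \frac{m-1}{m} \sigma^2 \quad \text{a.s.} \]

Now consider

\[ I_2 = \frac{1}{s+1} \left( \bar X_0^{(s)} - \hat\mu_0^{(s)} \right)^2 \le \frac{1}{s+1} \left( \frac{s}{1+s} \right)^2 \left( \Delta^{(s)} \right)^2 \le \frac{2 \log s}{s+1} \left( \frac{\Delta^{(s)}}{\sqrt{2 \log s}} \right)^2, \]

where the first inequality follows from Lemma 2. Now

\[ \Delta^{(s)} = \min_{1 \le i \le s} \Delta_i^{(s)} = \min_{1 \le i \le s} \left( \bar X_i^{(s)} - \bar X_0^{(s)} \right) = \min_{1 \le i \le s} \left( \mu_i + \sigma Z_i/\sqrt{m} \right) - \bar X_0^{(s)}, \tag{23} \]

where $Z_i = (\bar X_i^{(s)} - \mu_i)/(\sigma/\sqrt{m})$. Note that $Z_i$ are i.i.d. $N(0, 1)$
random variables. From assumptions A1 and A2 we get

\[ \mu_0 + \frac{\sigma}{\sqrt{m}} \min_{1 \le i \le s} Z_i - \bar X_0^{(s)} \le \Delta^{(s)} \le B + \frac{\sigma}{\sqrt{m}} \min_{1 \le i \le s} Z_i - \bar X_0^{(s)} . \tag{24} \]

Observe that $\bar X_0^{(s)}$ does not depend on $s$, and from (24) it follows that
$\Delta^{(s)}/\sqrt{2 \log s} \to -\sigma/\sqrt{m}$ a.s. From (23) and the bound on $I_2$ above it now follows that $I_2 \to 0$.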



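Finally we consider

\[ I_4 = \frac{1}{s+1} \sum_{i=1}^{s} V_{is}, \]

where $V_{is} = (\bar X_i^{(s)} - \hat\mu_i^{(s)})^2$. From the definition of $\hat\mu_i^{(s)}$, it follows that
$V_{is} = (\bar X_i^{(s)} - \hat\mu_0^{(s)})^2 \mathbf{1}\{ \bar X_i^{(s)} < \hat\mu_0^{(s)} \}$.
From Lemma 2, $\hat\mu_0^{(s)} \le \bar X_0^{(s)}$, so $V_{is} \le (\bar X_i^{(s)} - \bar X_0^{(s)})^2 \mathbf{1}\{ \bar X_i^{(s)} < \hat\mu_0^{(s)} \}$.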
Thus



(26) 

Let $\hat\mu_{0(-i)}^{(s)}$ be the estimate of $\mu_0$ obtained after dropping the $i$-th population,
that is, using only the data $(X_{0k})_{1 \le k \le m}$ and $\{X_{jk} : 1 \le k \le m\}_{1 \le j \le s, \, j \ne i}$.
Recall that $n_0^{(s)} = n^{(s)} = m$ and notice that



So it follows that 



mm 

l<j<s,i^j 



1< < s. 



J=2 



>+x<=))>x(=>} 



(27) 



J=2 



(28) 



Let us denote = 2X|,'' - X^^"' which has N {2fii - /lo, 5crV?Ti) 

distribution. By taking expectation on both sides of equation ((28} we get 



E [V^s] < E 



< E 



- ^J■3 



i=2 



(s-l) 



(29) 



The last inequality holds since $\mu_0 \le \mu_j \le B$ for all $1 \le j \le s$ by
assumptions A1 and A2.

Now applying the Cauchy-Schwarz inequality to (29) we get



E[V^s] < i E 



1 



E 



2(s-l) 



Now notice that E 



(30) 



does not depend on s and (Zf' - S) /cr 



N {y/m{2fii — po — B)/a, 5) distribution. Furthermore, since 2/ii — /^o 
B > fio — B, it follows that 



; m < C^E [{l-<l>(H/)}^(^-^'], (31) 
where $C$ is a constant that does not depend on $s$ (it is determined by the expectation in (30)) and $W$ has the $N(\sqrt{m}(\mu_0 - B)/\sigma, 5)$ distribution.

Notice that the right-hand side of (31) does not depend on $i$, so



i— 1 

Now using the DCT we conclude that $E[I_4] \to 0$ as $s \to \infty$. This completes
the proof. □

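4 Technical Results on the MLEs of $\mu$
and $\sigma^2$

In this section we present some technical results on the constrained MLEs
of $\mu$ and $\sigma^2$ which we have used to prove the theorems. Our first result
gives an easy but very important decomposition of the MLE $\hat\sigma^2_{(s)}$ of $\sigma^2$,
which we refer to as the fundamental decomposition. The proof of this lemma
is straightforward, so we omit it.

Lemma 1. The constrained MLE $\hat\sigma^2_{(s)}$ of $\sigma^2$ admits the following decomposition:

\[
\hat\sigma^2_{(s)} = \frac{1}{N^{(s)}} \sum_{j=1}^{n_0^{(s)}} \left( X_{0j} - \bar X_0^{(s)} \right)^2
+ \frac{n_0^{(s)}}{N^{(s)}} \left( \bar X_0^{(s)} - \hat\mu_0^{(s)} \right)^2
+ \frac{1}{N^{(s)}} \sum_{i=1}^{s} \sum_{j=1}^{n_i^{(s)}} \left( X_{ij} - \bar X_i^{(s)} \right)^2
+ \frac{1}{N^{(s)}} \sum_{i=1}^{s} n_i^{(s)} \left( \bar X_i^{(s)} - \hat\mu_i^{(s)} \right)^2 . \tag{32}
\]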

Note that the first and the third terms in (32) do not involve the order
restricted MLEs of the means. From our assumptions, $N^{(s)}$ increases
strictly with $s$, so the asymptotic behaviour of these two terms can be
determined from classical results such as the SLLN and the CLT for i.i.d.
random variables with finite second moment.
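For completeness, the one step behind the fundamental decomposition can be written out. The display below is a sketch of the standard argument rather than the authors' omitted proof: for each population the sum of squares about $\hat\mu_i^{(s)}$ splits into the sum of squares about the sample mean plus a squared-bias term, because the cross term vanishes.

```latex
\begin{align*}
\sum_{j=1}^{n_i^{(s)}} \bigl( X_{ij} - \hat\mu_i^{(s)} \bigr)^2
  &= \sum_{j=1}^{n_i^{(s)}} \bigl( X_{ij} - \bar X_i^{(s)} \bigr)^2
   + n_i^{(s)} \bigl( \bar X_i^{(s)} - \hat\mu_i^{(s)} \bigr)^2
   + 2 \bigl( \bar X_i^{(s)} - \hat\mu_i^{(s)} \bigr)
     \sum_{j=1}^{n_i^{(s)}} \bigl( X_{ij} - \bar X_i^{(s)} \bigr), \\
\sum_{j=1}^{n_i^{(s)}} \bigl( X_{ij} - \bar X_i^{(s)} \bigr) &= 0 .
\end{align*}
```

Summing the first identity over $i = 0, 1, \ldots, s$, dividing by $N^{(s)}$, and separating the $i = 0$ terms from the rest gives the four terms of the decomposition.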

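The next result gives two very useful upper and lower bounds on the
MLE $\hat\mu_0^{(s)}$ of $\mu_0$.

Lemma 2. Let $\rho^{(s)} := s n^{(s)}/n_0^{(s)}$ and $\Delta^{(s)} := \min_{1 \le i \le s} (\bar X_i^{(s)} - \bar X_0^{(s)})$.
Then

\[ \bar X_0^{(s)} + \frac{\rho^{(s)}}{1 + \rho^{(s)}} \Delta^{(s)} \mathbf{1}\{\Delta^{(s)} < 0\} \le \hat\mu_0^{(s)} \le \bar X_0^{(s)} . \tag{33} \]

Proof: Let $S = \{1, 2, \ldots, s\}$. By definition

\[ \hat\mu_0^{(s)} = \min_{I \subseteq S} \frac{n_0^{(s)} \bar X_0^{(s)} + n^{(s)} \sum_{i \in I} \bar X_i^{(s)}}{n_0^{(s)} + n^{(s)} |I|} = \bar X_0^{(s)} + \min_{I \subseteq S} \frac{n^{(s)} \sum_{i \in I} (\bar X_i^{(s)} - \bar X_0^{(s)})}{n_0^{(s)} + n^{(s)} |I|} . \]

Taking $I = \emptyset$ gives the upper bound. Suppose $\Delta_i^{(s)} = \bar X_i^{(s)} - \bar X_0^{(s)}$ and $\Delta^{(s)} = \min_{1 \le i \le s} \Delta_i^{(s)}$. Fix $I \subseteq S$, $I \ne \emptyset$.
Then

\[ \frac{n^{(s)} |I| \Delta^{(s)}}{n_0^{(s)} + n^{(s)} |I|} \le \frac{n^{(s)} \sum_{i \in I} \Delta_i^{(s)}}{n_0^{(s)} + n^{(s)} |I|} . \]

Taking the minimum on both sides we get:

\[ \min_{I \subseteq S, I \ne \emptyset} \frac{n^{(s)} |I| \Delta^{(s)}}{n_0^{(s)} + n^{(s)} |I|} \le \min_{I \subseteq S, I \ne \emptyset} \frac{n^{(s)} \sum_{i \in I} \Delta_i^{(s)}}{n_0^{(s)} + n^{(s)} |I|} . \tag{34} \]

Note that the function $f(x) = (n^{(s)} x c)/(n_0^{(s)} + x n^{(s)})$ is a non-decreasing
function if $c \ge 0$ and non-increasing if $c < 0$. So

\[ \min_{I \subseteq S, I \ne \emptyset} \frac{n^{(s)} |I| \Delta^{(s)}}{n_0^{(s)} + n^{(s)} |I|} \mathbf{1}\{\Delta^{(s)} < 0\} \ge \frac{s n^{(s)} \Delta^{(s)}}{n_0^{(s)} + s n^{(s)}} \mathbf{1}\{\Delta^{(s)} < 0\} = \frac{\rho^{(s)}}{1 + \rho^{(s)}} \Delta^{(s)} \mathbf{1}\{\Delta^{(s)} < 0\} . \]

Now using the observation that $\Delta^{(s)} \ge 0$ implies $\hat\mu_0^{(s)} = \bar X_0^{(s)}$, the inequality
follows. □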
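As we read Lemma 2, it sandwiches $\hat\mu_0^{(s)} - \bar X_0^{(s)}$ between $\frac{\rho^{(s)}}{1 + \rho^{(s)}} \Delta^{(s)} \mathbf{1}\{\Delta^{(s)} < 0\}$ and $0$. The sketch below checks this numerically on random draws; the function name, the choice $\mu = 0$, $\sigma^2 = 1$, and the floating-point tolerance are assumptions made for illustration only.

```python
import numpy as np

def lemma2_bounds_hold(s, n0, n, reps=1000, seed=0):
    """Check rho/(1+rho) * Delta * 1{Delta<0} <= mu0_hat - Xbar_0 <= 0 on simulated data."""
    rng = np.random.default_rng(seed)
    rho = s * n / n0
    for _ in range(reps):
        xbar0 = rng.standard_normal() / np.sqrt(n0)
        xbar = np.sort(rng.standard_normal(s) / np.sqrt(n))
        # Restricted estimate of mu_0: smallest pooled average over the k smallest means.
        partial = np.concatenate(([0.0], np.cumsum(xbar)))
        mu0 = ((n0 * xbar0 + n * partial) / (n0 + n * np.arange(s + 1))).min()
        delta = xbar[0] - xbar0                      # Delta = min_i (Xbar_i - Xbar_0)
        lower = xbar0 + rho / (1.0 + rho) * min(delta, 0.0)
        if not (lower - 1e-12 <= mu0 <= xbar0 + 1e-12):
            return False
    return True

ok = lemma2_bounds_hold(s=20, n0=3, n=3)
```

A passing check is of course not a proof; it only guards against misreading the direction of the inequalities.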
We observe that from Lemma 2 it follows that

\[ \frac{\rho^{(s)}}{1 + \rho^{(s)}} \Delta^{(s)} \mathbf{1}\{\Delta^{(s)} < 0\} \le \hat\mu_0^{(s)} - \bar X_0^{(s)} \le 0 . \tag{35} \]



From the above lemma the following result follows, which we present as a
stand-alone fact. Note that Chaudhuri and Perlman (Theorem 2.5 of their paper)
proved a weaker version using a different technique.

Proposition 1. Suppose $n^{(s)}/\log s \to \infty$ as $s \to \infty$. Then $\hat\mu_0^{(s)} \to \mu_0$
if and only if $n_0^{(s)} \to \infty$.

Proof: Using an argument similar to the one leading to equation (24), we conclude
that if $n^{(s)}/\log s \to \infty$ then $\frac{\rho^{(s)}}{1 + \rho^{(s)}} \Delta^{(s)} \mathbf{1}\{\Delta^{(s)} < 0\} \to 0$ holds, and
hence from (35) we get $\hat\mu_0^{(s)} - \bar X_0^{(s)} \to 0$.

The result follows from the fact that $\bar X_0^{(s)} \to \mu_0$ if and only if $n_0^{(s)} \to \infty$.
□